NSF PAR Search | NSF Public Access Repository

Fully First-Order Methods for Linearly Constrained Bilevel Optimization

Kornowski, Guy; Padmanabhan, Swati; Wang, Kai; Zhang, Zhe; Sra, Suvrit (December 2024, NeurIPS 2024)

Full Text Available
Transformers learn to implement preconditioned gradient descent for in-context learning

Ahn, Kwangjun; Cheng, Xiang; Daneshmand, Hadi; Sra, Suvrit (December 2023, Advances in Neural Information Processing Systems)

Full Text Available
CCCP is Frank-Wolfe in disguise

Yurtsever, Alp; Sra, Suvrit (December 2022, Advances in Neural Information Processing Systems)

Full Text Available
Efficient Sampling on Riemannian Manifolds via Langevin MCMC

Cheng, Xiang; Zhang, Jingzhao; Sra, Suvrit (December 2022, Advances in Neural Information Processing Systems)

Full Text Available
Can Direct Latent Model Learning Solve Linear Quadratic Gaussian Control?

Tian, Yi; Zhang, Kaiqing; Tedrake, Russ; Sra, Suvrit (January 2023, Conference on Learning for Dynamics and Control)

We study the task of learning state representations from potentially high-dimensional observations, with the goal of controlling an unknown partially observable system. We pursue a direct latent model learning approach, where a dynamic model in some latent state space is learned by predicting quantities directly related to planning (e.g., costs) without reconstructing the observations. In particular, we focus on an intuitive cost-driven state representation learning method for solving Linear Quadratic Gaussian (LQG) control, one of the most fundamental partially observable control problems. As our main results, we establish finite-sample guarantees of finding a near-optimal state representation function and a near-optimal controller using the directly learned latent model. To the best of our knowledge, despite various empirical successes, prior to this work it was unclear if such a cost-driven latent model learner enjoys finite-sample guarantees. Our work underscores the value of predicting multi-step costs, an idea that is key to our theory, and notably also an idea that is known to be empirically valuable for learning state representations.
Full Text Available
Understanding Riemannian acceleration via a proximal extragradient framework

Jin, Jikai; Sra, Suvrit (June 2022, PMLR)

Full Text Available
Understanding the unstable convergence of gradient descent

Ahn, Kwangjun; Zhang, Jingzhao; Sra, Suvrit (July 2022, PMLR)

Full Text Available
Understanding Nesterov's Acceleration via Proximal Point Method

https://doi.org/10.1137/1.9781611977066.9

Ahn, Kwangjun; Sra, Suvrit (April 2022, Symposium on Simplicity in Algorithms (SOSA))

Full Text Available
Max-Margin Contrastive Learning

https://doi.org/10.1609/aaai.v36i8.20796

Shah, Anshul; Sra, Suvrit; Chellappa, Rama; Cherian, Anoop (June 2022, Proceedings of the AAAI Conference on Artificial Intelligence)

Standard contrastive learning approaches usually require a large number of negatives for effective unsupervised learning and often exhibit slow convergence. We suspect this behavior is due to the suboptimal selection of negatives used for offering contrast to the positives. We counter this difficulty by taking inspiration from support vector machines (SVMs) to present max-margin contrastive learning (MMCL). Our approach selects negatives as the sparse support vectors obtained via a quadratic optimization problem, and contrastiveness is enforced by maximizing the decision margin. As SVM optimization can be computationally demanding, especially in an end-to-end setting, we present simplifications that alleviate the computational burden. We validate our approach on standard vision benchmark datasets, demonstrating better performance in unsupervised representation learning over state-of-the-art, while having better empirical convergence properties.
Full Text Available
Time Varying Regression with Hidden Linear Dynamics

Mania, Horia; Jadbabaie, Ali; Shah, Devavrat; Sra, Suvrit (May 2022, PMLR)

Full Text Available
